The Circular Drift Diffusion Model

The circular drift-diffusion model (CDDM) is a stochastic sequential-sampling model that describes the choices and response times observed in tasks with a circular decision space (Smith, 2016). Like many other accumulator models, the CDDM assumes that participants accumulate information in favor of a given response option over time, until a response criterion is met. This process is characterized as a random walk that starts at the origin of a circle representing the decision space and moves toward its circumference. Once the circumference is reached, the radian at the point of intersection determines the choice observed, and the number of steps taken to get there determines the response time.
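This random walk can be sketched in a few lines of R. The discretization below is a hypothetical illustration (it assumes a unit diffusion coefficient and arbitrary parameter values); the samplers actually used in this document live in the sourced scripts.

```r
# Minimal sketch of the CDDM random walk (hypothetical step size dt and
# parameter values; unit diffusion coefficient assumed)
set.seed(1)
dt <- 0.001                        # time elapsed per step
mu_x <- 3.5 * cos(pi)              # mean displacement per unit time on X
mu_y <- 3.5 * sin(pi)              # ... and on Y
boundary <- 2                      # response criterion (circle radius)
x <- y <- 0; steps <- 0            # walk starts at the origin
while (sqrt(x^2 + y^2) < boundary) {
  x <- x + rnorm(1, mu_x * dt, sqrt(dt))   # noisy step on X
  y <- y + rnorm(1, mu_y * dt, sqrt(dt))   # noisy step on Y
  steps <- steps + 1
}
choice <- atan2(y, x) %% (2 * pi)  # radian where the boundary was crossed
rt     <- steps * dt               # decision time (non-decision time excluded)
```

The choice is read off as the angle of the crossing point and the response time as the number of steps times the step duration.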

The CDDM has four parameters, most of which are illustrated in Fig 1. The parameter not shown is the non-decision time \(\tau\). The remaining parameters are the boundary radius \(\eta\), which serves as the response criterion, and a pair of parameters describing the overall speed and direction of the information-accumulation process. These last two parameters can be expressed in Cartesian coordinates (i.e., the mean displacement on the X and Y coordinates per unit of time, \(\{\mu_x, \mu_y\}\)) or in polar coordinates (i.e., the average speed and expected direction, \(\{\delta,\theta\}\)).
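The two parameterizations are related by the standard polar-to-Cartesian conversion, for example:

```r
# Polar (speed and direction) to Cartesian mean displacements, and back
drift <- 3.5; theta <- pi                      # polar parameters
mu_x <- drift * cos(theta)                     # mean displacement on X
mu_y <- drift * sin(theta)                     # mean displacement on Y
drift_back <- sqrt(mu_x^2 + mu_y^2)            # recover the speed
theta_back <- atan2(mu_y, mu_x) %% (2 * pi)    # recover the direction in [0, 2*pi)
```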

Fig 1. Graphical illustration of the CDDM


dCDDM(*) - CDDM density function

source("./code/cddm/dCDDM.R")

# drift, theta, tzero, and boundary are assumed to be defined beforehand
choice <- 5; rt <- 1  # a choice (in radians) and a response time
dCDDM(c(choice, rt), drift, theta, tzero, boundary)
## [1] 0.9630938

pCDDM(*) - CDDM cumulative distribution function

source("./code/cddm/pCDDM.R")

x <- pCDDM(data_fig1, drift, theta, tzero, boundary, plot=TRUE)

Illustrative parameter set

In this document, we compare different sampling algorithms that generate data under the CDDM. We describe how each algorithm works and highlight how they differ in the computation time they require and in their accuracy. For starters, we generate n = 5000 bivariate observations using the arbitrary set of parameter values shown below:

# Arbitrary set of parameter values
par <- list("drift" = 3.5, 
            "theta" = pi,
            "tzero" = 0.2,
            "boundary" = 2)
n <- 5000 # No. samples

Algorithm 1: Random walk emulator

Each pair of observations is obtained by emulating the random-walk process described by the CDDM. We emulate the information-accumulation process by iteratively sampling random movements in the X and Y directions using \(\mu_x\) and \(\mu_y\), until the walk crosses the boundary circle of radius \(\eta\).

source("./code/cddm/sim_randomWalk.R")
X.RW <- sample.RW.cddm(n,par)
plot.CDDM_choiceData(X.RW, par, choice.col.RGB = rgb.RW)

par(pty="m", mar = c(3, 3, 3, 0)) 
hist(X.RW$bivariate.data[,2], main = "Response Times", col = col.RW(0.7))

plot.CDDM_margECDF(X.RW$bivariate.data, color=col.RW(0.3))

The execution of the Random Walk emulator algorithm took approximately 6.7114 seconds.

Algorithm 2: Rejection sampling algorithm

source("./code/cddm/sim_Reject.R")
X.Reject <- sample.Reject.cddm(n,par,plot=TRUE)
Fig.2 Visual representation of the rejection sampling algorithm. The blue lines represent the joint density function of response times and choices prescribed by the CDDM. The dots represent candidate pairs of values generated from the bivariate space of possible RTs and choices, plotted at the height of the random uniform(0, maxDensity) value used as the rejection criterion. If the criterion falls below the density function (i.e., the dot falls below the density curve), the candidate is accepted; if it surpasses the density (i.e., the dot is drawn above the curve), the candidate is rejected. Generating, testing, and keeping candidates is repeated until the desired sample size is reached.

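The same accept/reject logic can be sketched in one dimension. The snippet below uses a simple triangular target density for illustration; it is not the CDDM sampler itself, which lives in the sourced sim_Reject.R.

```r
# One-dimensional rejection sampling sketch: target density f(x) = 2x on [0, 1]
set.seed(1)
f <- function(x) 2 * x                      # illustrative target density
maxDensity <- 2                             # upper bound on the density
n <- 1000
samples <- numeric(n); kept <- 0
while (kept < n) {
  candidate <- runif(1, 0, 1)               # candidate drawn from the support
  criterion <- runif(1, 0, maxDensity)      # uniform rejection criterion
  if (criterion < f(candidate)) {           # dot falls below the density curve
    kept <- kept + 1
    samples[kept] <- candidate              # ... so the candidate is kept
  }
}
```

Since \(f(x) = 2x\) has mean 2/3, the sample mean of the accepted candidates should land close to that value.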

plot.CDDM_choiceData(X.Reject, par, choice.col.RGB = rgb.Reject)

par(pty="m", mar = c(3, 3, 3, 0)) 
hist(X.Reject[,2], main = "Response Times", col=col.Reject(0.7))

plot.CDDM_margECDF(X.Reject, color=col.Reject(0.3))

The execution of the Rejection sampling algorithm took approximately 2.1625 seconds.

Algorithm 3: Metropolis algorithm

The following Metropolis algorithm uses the density function to generate random observations under the CDDM. The comments left throughout the code explain how the algorithm works in detail. To run, the algorithm only needs a list par specifying the values of the four model parameters (in either parameterization, polar or Cartesian coordinates).
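The core accept/reject step of any Metropolis sampler can be sketched in one dimension. A standard normal target stands in for the CDDM density here; the sourced sim_Metropolis.R applies the same logic to the bivariate CDDM case.

```r
# One-dimensional Metropolis sketch with a symmetric random-walk proposal
set.seed(1)
target <- function(x) dnorm(x)     # stand-in for the CDDM density
n <- 5000
chain <- numeric(n)
chain[1] <- 0                      # starting value of the chain
for (i in 2:n) {
  proposal <- rnorm(1, chain[i - 1], 1)                    # propose a move
  accept_prob <- min(1, target(proposal) / target(chain[i - 1]))
  if (runif(1) < accept_prob) chain[i] <- proposal         # accept the move
  else                        chain[i] <- chain[i - 1]     # reject: stay put
}
```

After enough iterations the chain's values are (correlated) draws from the target, so their mean and standard deviation should approach 0 and 1.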

source("./code/cddm/sim_Metropolis.R")
X.Metro <- sample.Metropolis.cddm(n,par,plot=TRUE)
plot.CDDM_choiceData(X.Metro, par, choice.col.RGB = rgb.Metro)

par(pty="m", mar = c(3, 3, 3, 0)) 
hist(X.Metro[,2], main = "Response Times", col=col.Metro(0.7))

plot.CDDM_margECDF(X.Metro, color=col.Metro(0.3))

The execution of the Metropolis sampling algorithm took approximately 1.2355 seconds.

Algorithm 4: Inverse CDF (grid approximation)

source("./code/cddm/sim_invCDF.R")
X.invCDF <- sample.invCDF.cddm(n, par)
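The grid-approximation idea behind this sampler can be sketched in one dimension. An exponential target is used purely for illustration; the sourced sim_invCDF.R extends the idea to the bivariate CDDM case.

```r
# One-dimensional inverse-CDF sampling via grid approximation
set.seed(1)
grid <- seq(0, 10, length.out = 10000)   # discretize the support
cdf  <- pexp(grid, rate = 1)             # tabulate the CDF on the grid
u <- runif(5000)                         # uniform draws to invert
# Map each u to the first grid point whose tabulated CDF value reaches it
idx <- pmin(findInterval(u, cdf) + 1, length(grid))
x <- grid[idx]
```

Because the Exp(1) distribution has mean 1, the grid-inverted samples should average close to 1 (up to the truncation of the grid at 10).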

plot.CDDM_choiceData(X.invCDF, par, choice.col.RGB = rgb.invCDF)

par(pty="m", mar = c(3, 3, 3, 0)) 
hist(X.invCDF[,2], main = "Response Times", col = col.invCDF(0.7))

plot.CDDM_margECDF(X.invCDF, color=col.invCDF(0.3))

The execution of the inverse-CDF algorithm took approximately 9.5287 seconds.

Algorithm 5: Independent sampling

This algorithm is left as future work.

Accuracy testing

Step 1: Obtain empirical CDF (eCDF)

# Load file containing custom eCDF function
source("./code/general_functions/eCDF.R")

RW.eCDF <- myECDF(X.RW$bivariate.data)
Reject.eCDF <- myECDF(X.Reject)
Metro.eCDF  <- myECDF(X.Metro)
invCDF.eCDF <- myECDF(X.invCDF)
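myECDF is a custom function defined in the sourced eCDF.R. Assuming it evaluates the bivariate empirical CDF at each observed data point, a minimal version could look like this sketch (not the sourced implementation):

```r
# Hypothetical minimal bivariate eCDF: for each observed (choice, RT) pair,
# compute the proportion of observations that are componentwise <= it
myECDF_sketch <- function(data) {
  apply(data, 1, function(pt) mean(data[, 1] <= pt[1] & data[, 2] <= pt[2]))
}
```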

Step 2: Approximate the theoretical CDF (tCDF)

RW.tCDF <- pCDDM(X.RW$bivariate.data, par$drift, par$theta, par$tzero, par$boundary)
Reject.tCDF <- pCDDM(X.Reject, par$drift, par$theta, par$tzero, par$boundary)
Metro.tCDF <- pCDDM(X.Metro, par$drift, par$theta, par$tzero, par$boundary)
invCDF.tCDF <- pCDDM(X.invCDF, par$drift, par$theta, par$tzero, par$boundary)

Obtaining the approximate CDF for the data generated with each sampling algorithm took between 1.421 and 1.4773 seconds.

Step 3: Compare eCDFs vs approximate tCDF

# We build a simple function to compare these CDFs
getDifferences <- function(eCDFs, tCDFs){
  difference <- tCDFs - eCDFs
  difference.sum <- apply(difference, 2, sum)       # signed total difference
  KS_statistic   <- apply(abs(difference), 2, max)  # Kolmogorov-Smirnov statistic
  sq.difference  <- apply(difference^2, 2, sum)     # sum of squared differences
  
  output <- cbind(difference.sum, KS_statistic, sq.difference)
  colnames(output) <- c("sumDiff", "KS-statistic", "SSDiff")
  rownames(output) <- sub("\\..*", "", colnames(eCDFs))
  return(output)
}
eCDFs <- cbind(RW.eCDF, Reject.eCDF, Metro.eCDF, invCDF.eCDF)
tCDFs <- cbind(RW.tCDF, Reject.tCDF, Metro.tCDF, invCDF.tCDF)

getDifferences(eCDFs,tCDFs)
##            sumDiff KS-statistic      SSDiff
## RW        8.304419   0.02146699   0.1574057
## Reject   -3.824839   0.01530811   0.0706437
## Metro   -27.830784   0.03201822   0.4104939
## invCDF 1487.738574   1.00002528 573.7454942

Comparing computation times

compareTime <- function(n.Samples, n.Datasets, par){
    empty_matrix <- matrix(NA, nrow = n.Datasets, ncol = length(n.Samples))
    times.RW <- times.Reject <- times.Metro <- empty_matrix
    seed <- 1
    for(m in 1:n.Datasets){ 
        set.seed(seed)
        i <- 1
        for(n in n.Samples){
            # Time the random-walk emulator
            start_time <- Sys.time()
            sample.RW.cddm(n, par)
            times.RW[m,i] <- round(Sys.time() - start_time, 4)
            # Time the rejection sampler
            start_time <- Sys.time()
            sample.Reject.cddm(n, par, plot = FALSE)
            times.Reject[m,i] <- round(Sys.time() - start_time, 4)
            # Time the Metropolis sampler
            start_time <- Sys.time()
            sample.Metropolis.cddm(n, par, plot = FALSE)
            times.Metro[m,i] <- round(Sys.time() - start_time, 4)
            i <- i + 1
            seed <- seed + 1
        }
    }
return(list("times.RW" = times.RW,
            "times.Reject" = times.Reject,
            "times.Metro" = times.Metro))
}

Second example

Let’s briefly explore the results we get with a second set of parameter values featuring a much larger response boundary (and a slower accumulation process).

# Arbitrary set of parameter values
par2 <- list("drift" = 1,
             "theta" = pi,
             "tzero" = 0.1,
             "boundary" = 7)
n <- 5000

##            sumDiff KS-statistic      SSDiff
## RW      -11.973638   0.01785320   0.1218543
## Reject   -9.570871   0.02068502   0.2129392
## Metro   123.711253   0.16287738  17.6833615
## invCDF 1491.304688   0.99392550 566.2066087

References

  • Smith, P. L. (2016). Diffusion theory of decision making in continuous report. Psychological Review, 123(4), 425.
  • Smith, P. L., Garrett, P. M., & Zhou, J. (2023). Obtaining stable predicted distributions of response times and decision outcomes for the circular diffusion model. Computational Brain & Behavior, 6(4), 543-555.